An approach to graph-based analysis of textual documents
In this paper, a new graph-based model is proposed for the representation of textual documents. Graph structures are obtained from textual documents by making use of the well-known Part-Of-Speech (POS) tagging technique. More specifically, a simple rule-based (re)classifier is used to map each tag onto graph vertices and edges. As a result, a decomposition of textual documents is obtained in which tokens are automatically parsed and attached to either a vertex or an edge. It is shown how textual documents can be aggregated through their graph structures and, finally, how vertex-ranking methods can be used to find relevant tokens.
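The idea of mapping POS tags onto vertices and edges and then ranking vertices can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's actual rule set: the tag groups `VERTEX_TAGS` and `EDGE_TAGS`, the hand-tagged input, the helper names `build_graph` and `pagerank`, and the choice of PageRank as the vertex-ranking method.

```python
# Hypothetical rule: noun-like tags become vertices, verb/preposition tags
# become labels of edges linking the previous vertex to the next one.
VERTEX_TAGS = {"NN", "NNS", "NNP"}
EDGE_TAGS = {"VB", "VBZ", "VBD", "IN"}


def build_graph(tagged):
    """Return (vertices, edges); edges are (source, label, target) triples."""
    vertices, edges = [], []
    last_vertex, pending_label = None, None
    for token, tag in tagged:
        if tag in VERTEX_TAGS:
            if token not in vertices:
                vertices.append(token)
            if last_vertex is not None and pending_label is not None:
                edges.append((last_vertex, pending_label, token))
            last_vertex, pending_label = token, None
        elif tag in EDGE_TAGS and last_vertex is not None:
            pending_label = token
        elif tag == ".":
            # Sentence boundary: do not link across sentences.
            last_vertex, pending_label = None, None
    return vertices, edges


def pagerank(vertices, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; dangling mass is spread uniformly."""
    out = {v: [t for s, _, t in edges if s == v] for v in vertices}
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in vertices if not out[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in vertices}
        for v in vertices:
            for target in out[v]:
                new[target] += damping * rank[v] / len(out[v])
        rank = new
    return rank


# Hand-tagged toy input (assumed; a real pipeline would use a POS tagger).
tagged = [
    ("cat", "NN"), ("chases", "VBZ"), ("mouse", "NN"), (".", "."),
    ("mouse", "NN"), ("eats", "VBZ"), ("cheese", "NN"), (".", "."),
    ("dog", "NN"), ("chases", "VBZ"), ("cat", "NN"), (".", "."),
]
vertices, edges = build_graph(tagged)
ranks = pagerank(vertices, edges)
for token in sorted(ranks, key=ranks.get, reverse=True):
    print(token, round(ranks[token], 3))
```

Because the three sentences share tokens, their small graphs merge into one, which loosely mirrors the aggregation of documents through their graph structures mentioned in the abstract.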
Flexible information retrieval: some research trends
In this paper, some research trends in the field of Information Retrieval are presented. The focus is on the definition of flexible systems, i.e. systems that can represent and manage the vagueness and uncertainty characteristic of the process of information searching and retrieval. The application of soft computing techniques is considered, in particular fuzzy set theory.
Fuzzy order-sorted feature logic
Order-Sorted Feature (OSF) logic is a knowledge representation and reasoning
language based on function-denoting feature symbols and set-denoting sort
symbols ordered in a subsumption lattice. OSF logic allows the construction of
record-like terms that represent classes of entities and that are themselves
ordered in a subsumption relation. The unification algorithm for such
structures provides an efficient calculus of type subsumption, which has been
applied in computational linguistics and implemented in constraint logic
programming languages such as LOGIN and LIFE and automated reasoners such as
CEDAR. This work generalizes OSF logic to a fuzzy setting. We give a flexible
definition of a fuzzy subsumption relation which generalizes Zadeh's inclusion
between fuzzy sets. Based on this definition we define a fuzzy semantics of OSF
logic where sort symbols and OSF terms denote fuzzy sets. We extend the
subsumption relation to OSF terms and prove that it constitutes a fuzzy partial
order with the property that two OSF terms are subsumed by one another in the
crisp sense if and only if their subsumption degree is greater than 0. We show
how to find the greatest lower bound of two OSF terms by unifying them and how
to compute the subsumption degree between two OSF terms, and we provide the
complexity of these operations. Comment: Accepted for publication in Fuzzy Sets and Systems
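For reference, the crisp notion that the abstract says is generalized, Zadeh's inclusion between fuzzy sets, holds when the membership degree of every element in the first set does not exceed its degree in the second. The sketch below shows only this standard inclusion test; it does not attempt the paper's graded subsumption relation, whose definition is not given in the abstract. The helper name `zadeh_included` and the example sorts are hypothetical.

```python
def zadeh_included(mu_a, mu_b):
    """Zadeh's inclusion: A is included in B iff mu_A(x) <= mu_B(x) for all x.

    Fuzzy sets are dicts mapping elements to membership degrees in [0, 1];
    elements missing from a dict have degree 0.
    """
    universe = set(mu_a) | set(mu_b)
    return all(mu_a.get(x, 0.0) <= mu_b.get(x, 0.0) for x in universe)


# Hypothetical fuzzy denotations of two sort symbols.
student = {"ann": 0.8, "bob": 0.3}
person = {"ann": 1.0, "bob": 1.0, "rex": 0.1}

print(zadeh_included(student, person))  # student is included in person
print(zadeh_included(person, student))  # the converse fails
```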
Personalization in BERT with Adapter Modules and Topic Modelling
As a result of the widespread use of intelligent assistants, personalization in dialogue systems has become
a hot topic in both research and industry. Typically, training such systems is computationally expensive,
especially when using recent large language models. To address this challenge, we develop an approach
to personalize dialogue systems using adapter layers and topic modelling. Our implementation enables
the model to incorporate user-specific information, achieving promising results by training only a small
fraction of parameters.
SE-PQA: Personalized Community Question Answering
Personalization in Information Retrieval has been studied for a long time.
Nevertheless, there is still a lack of high-quality, real-world datasets to
conduct large-scale experiments and evaluate models for personalized search.
This paper contributes to filling this gap by introducing SE-PQA (StackExchange
- Personalized Question Answering), a new curated resource to design and
evaluate personalized models related to the task of community Question
Answering (cQA). The contributed dataset includes more than 1 million queries
and 2 million answers, annotated with a rich set of features modeling the
social interactions among the users of a popular cQA platform. We describe the
characteristics of SE-PQA and detail the features associated with questions and
answers. We also provide reproducible baseline methods for the cQA task based
on the resource, including deep learning models and personalization approaches.
The results of the preliminary experiments conducted show the appropriateness
of SE-PQA to train effective cQA models; they also show that personalization
remarkably improves the effectiveness of all the methods tested. Furthermore,
we show the benefits in terms of robustness and generalization of combining
data from multiple communities for personalization purposes.
Utilizing ChatGPT to Enhance Clinical Trial Enrollment
Clinical trials are a critical component of evaluating the effectiveness of
new medical interventions and driving advancements in medical research.
Therefore, timely enrollment of patients is crucial to prevent delays or
premature termination of trials. In this context, Electronic Health Records
(EHRs) have emerged as a valuable tool for identifying and enrolling eligible
participants. In this study, we propose an automated approach that leverages
ChatGPT, a large language model, to extract patient-related information from
unstructured clinical notes and generate search queries for retrieving
potentially eligible clinical trials. Our empirical evaluation, conducted on
two benchmark retrieval collections, shows improved retrieval performance
compared to existing approaches when several general-purpose and task-specific
prompts are used. Notably, ChatGPT-generated queries also outperform
human-generated queries in terms of retrieval performance. These findings
highlight the potential use of ChatGPT to enhance clinical trial enrollment
while ensuring the quality of medical service and minimizing direct risks to
patients. Comment: Under Review
A laboratory-based method for the evaluation of personalised search
Comparative evaluation of Information Retrieval Systems
(IRSs) using publicly available test collections has become
an established practice in Information Retrieval (IR). By
means of the popular Cranfield evaluation paradigm, IR test
collections enable researchers to compare new methods to
existing approaches. An important area of IR research where
this strategy has not been applied to date is Personalised
Information Retrieval (PIR), which has generally relied on
user-based evaluations. This paper describes a method that
enables the creation of publicly available extended test collections to allow repeatable laboratory-based evaluation of
personalised search.
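As a rough illustration of the Cranfield-style comparison the abstract refers to, the sketch below scores two systems against the same fixed relevance judgments using mean average precision. The judgments `qrels`, the runs `run_a` and `run_b`, and the function names are all invented for the example; the paper's extended, personalised test collections are not modelled here.

```python
def average_precision(ranking, relevant):
    """Average of precision values at the ranks where relevant docs appear."""
    hits, total = 0, 0.0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(run, qrels):
    """Mean of average precision over all judged queries."""
    return sum(average_precision(run[q], qrels[q]) for q in qrels) / len(qrels)


# Toy test collection: relevance judgments per query (assumed data).
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}

# Ranked results from two hypothetical systems on the same queries.
run_a = {"q1": ["d1", "d2", "d3"], "q2": ["d2", "d1", "d3"]}
run_b = {"q1": ["d2", "d1", "d3"], "q2": ["d1", "d2", "d3"]}

map_a = mean_average_precision(run_a, qrels)
map_b = mean_average_precision(run_b, qrels)
print(f"system A: {map_a:.3f}  system B: {map_b:.3f}")
```

Because the judgments are fixed and shared, the same comparison can be repeated by anyone holding the collection, which is the repeatability property the abstract emphasises.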